To address the poor denoising quality and long training time in image denoising, an image denoising model based on an approximately U-shaped network structure was proposed. Firstly, the original linear network structure was modified into an approximately U-shaped structure by using convolutional layers with different strides. Then, image information from different receptive fields was superimposed to preserve as much of the original image information as possible. Finally, a deconvolutional layer was introduced for image restoration and further noise removal. Experimental results on the Set12 and BSD68 test sets show that, compared with the Denoising Convolutional Neural Network (DnCNN) model, the proposed model increases the Peak Signal-to-Noise Ratio (PSNR) by 0.04 to 0.14 dB on average and reduces training time by 41% on average, verifying that it achieves better denoising quality with a shorter training time.
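As a point of reference for the PSNR figures quoted above, the following is a minimal sketch of the standard PSNR computation, assuming 8-bit images stored as nested lists of pixel values. The function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the Peak Signal-to-Noise Ratio in dB between two images.

    Both images are nested lists (rows of pixel values) of the same size.
    max_value is the largest possible pixel value (assumed 255 for 8-bit).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error between the clean image and the denoised image.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic, a gain of 0.04 to 0.14 dB corresponds to a small relative reduction in mean squared error.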
Existing deep learning based malicious code detection methods suffer from insufficient and inaccurate feature extraction. To address this, a malicious code detection method based on an attention mechanism and Residual Network (ResNet), called ARMD, was proposed. To support training, the hash values of 47 580 malicious and benign code samples were obtained from the Kaggle website, and the APIs called by each sample were extracted with the analysis tool VirusTotal. The called APIs were then consolidated into 1 000 distinct APIs that served as detection features, and the training samples were constructed from these features. Next, each sample was labeled as benign or malicious according to the VirusTotal analysis results, and the Synthetic Minority Over-sampling Technique (SMOTE) was used to balance the data samples. Finally, a ResNet incorporating the attention mechanism was built and trained to perform malicious code detection. Experimental results show that ARMD achieves a malicious code detection accuracy of 97.76% and that, compared with existing detection methods based on Convolutional Neural Network (CNN) and ResNet models, its average precision is improved by at least 2%, verifying the effectiveness of ARMD.
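The class balancing step can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is an interpolation between a minority sample and one of its nearest minority neighbours. This is not the authors' pipeline; the function name `smote`, its parameters, and the use of plain numeric feature vectors are assumptions made for illustration.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (illustrative sketch).

    minority: list of feature vectors (lists of floats), at least 2 entries.
    k: number of nearest minority neighbours considered for interpolation.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base sample.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic
```

In practice a maintained implementation would normally be preferred over a hand-written one; the sketch only shows why the synthetic samples stay inside the region spanned by the minority class.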
Heuristic and machine learning based code smell detection methods have been shown to have limitations, and most of them focus on common code smells. To address these problems, a deep learning based method was proposed to detect three relatively rare coupling-related code smells: Intensive Coupling, Dispersed Coupling and Shotgun Surgery. First, the metrics of the three code smells were extracted, and the resulting data were processed. Second, a deep learning model combining a Convolutional Neural Network (CNN) and an attention mechanism was constructed, in which the attention mechanism assigns weights to the metric features. The datasets were extracted from 21 open source projects, and the detection method was validated on 10 open source projects and compared with a CNN model. Experimental results show that the proposed model performs better on Intensive Coupling and Dispersed Coupling, with precisions of 93.61% and 99.76% respectively, whereas the CNN model performs better on Shotgun Surgery, with a precision of 98.59%.
Influence maximization is an important problem in social network analysis. It aims to identify a small group of seed nodes such that, when these nodes act as initial spreaders, information spreads to as many of the remaining nodes in the network as possible. Existing heuristic algorithms based on network topology usually consider only a single network centrality and fail to combine node characteristics with network topology comprehensively; as a result, their performance is unstable and easily affected by the network structure. To solve this problem, an influence maximization algorithm based on Node Coverage and Structural Hole (NCSH) was proposed. Firstly, the coverages and grid constraint coefficients of all nodes were calculated. Secondly, a seed was selected according to the principle of maximum coverage gain. Thirdly, if multiple nodes had the same gain, the seed was selected among them according to the principle of minimum grid constraint coefficient. Finally, these steps were repeated until all seeds were selected. NCSH maintains good performance on six real networks under different numbers of seeds and different spreading probabilities. NCSH achieves on average 3.8% higher node coverage than the similar NCA (Node Coverage Algorithm), and 43% lower time consumption than the similar SHDD (maximization algorithm based on Structure Hole and DegreeDiscount). The experimental results show that NCSH can effectively solve the influence maximization problem.
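The selection loop described above can be sketched as follows. The sketch rests on two assumptions that the abstract does not spell out: the coverage of a node is taken to be the node itself plus its direct neighbours, and the constraint coefficient is taken to be Burt's network constraint on an unweighted, undirected graph. The names `constraint` and `select_seeds` are hypothetical.

```python
def constraint(graph, i):
    """Burt's network constraint of node i (assumed definition).

    graph maps each node to the set of its neighbours.
    """
    def p(a, b):
        # Proportion of a's ties invested in b (uniform for unweighted graphs).
        return 1.0 / len(graph[a]) if b in graph[a] else 0.0

    total = 0.0
    for j in graph[i]:
        # Indirect investment in j through the other neighbours q of i.
        indirect = sum(p(i, q) * p(q, j) for q in graph[i] if q != j)
        total += (p(i, j) + indirect) ** 2
    return total


def select_seeds(graph, k):
    """Greedily pick k seeds by maximum coverage gain, breaking ties by
    minimum constraint coefficient (illustrative sketch)."""
    coeff = {v: constraint(graph, v) for v in graph}
    covered, seeds = set(), []
    for _ in range(k):
        candidates = [v for v in graph if v not in seeds]

        def gain(v):
            # Number of not-yet-covered nodes that v would newly cover.
            return len(({v} | graph[v]) - covered)

        best = min(candidates, key=lambda v: (-gain(v), coeff[v]))
        seeds.append(best)
        covered |= {best} | graph[best]
    return seeds


# Toy graph: nodes 0-3 form a clique, node 4 is the centre of a star.
toy = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2},
       4: {5, 6, 7}, 5: {4}, 6: {4}, 7: {4}}
seeds = select_seeds(toy, 2)
```

On the toy graph, node 4 and the clique nodes have the same coverage gain in the first round, so the tie is broken in favour of node 4, whose neighbours are not connected to each other and which therefore has the lower constraint coefficient.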
To address the problem that the streaming data processing system Flink cannot efficiently handle a single point of failure, a new fault-tolerant system based on incremental state and backup, Flink+, was proposed. Firstly, backup operators and data paths were established in advance. Secondly, the output data in the data flow graph was cached, with disks used if necessary. Thirdly, task state was synchronized during system snapshots. Finally, backup tasks and cached data were used to resume computation after a system failure. In system experiments and tests, Flink+ does not significantly increase fault tolerance overhead during fault-free operation. When handling a single point of failure in both single-machine and distributed environments, compared with Flink, the proposed system reduces failure recovery time by 96.98% with single-machine 8-task parallelism and by 88.75% with distributed 16-task parallelism. Experimental results show that combining incremental state with backup can effectively reduce the recovery time from a single point of failure in a stream system and enhance the robustness of the system.
Text features are a key part of natural language processing. To address the high dimensionality and sparseness of text features, a text feature selection method based on Word2Vec word embedding and the Genetic AlgoRithm for Biomarker selection in high-dimensional Omics (GARBO) was proposed to facilitate subsequent text classification tasks. Firstly, the form of the input data was optimized, and the Word2Vec word embedding method was used to transform the text into word vectors analogous to gene expression data. Then, the gene expression simulated by the high-dimensional word vectors was iteratively evolved. Finally, a random forest classifier was used to classify the text after feature selection. Experiments were conducted on a Chinese comment dataset to verify the proposed method. The experimental results show that the optimized GARBO feature selection method is effective for text feature selection: it reduces 300-dimensional features to 50 more valuable dimensions and achieves a classification accuracy of 88%. Compared with other filter-type text feature selection methods, the proposed method effectively reduces the dimension of text features and improves text classification performance.
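The iterative evolution step can be illustrated with a generic genetic algorithm for feature subset selection. This is not GARBO itself, whose details are not given in the abstract; it is a plain sketch with tournament selection, one-point crossover, bit-flip mutation and elitism. The function `ga_select`, its parameters and the toy fitness function are all hypothetical, and in a real setting the fitness would be something like the validation accuracy of a classifier trained on the selected dimensions.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Evolve a 0/1 mask over n_features (n_features >= 2) that maximizes
    fitness(mask). Returns the best mask and the per-generation best fitness."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        nxt = [pop[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            # Tournament selection of two parents (best of 3 random masks).
            a, b = (max(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per position.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy fitness (assumption): dimensions 0, 3 and 5 are informative, and every
# selected dimension carries a small cost that rewards compact subsets.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


best_mask, best_history = ga_select(8, toy_fitness)
```

Because the best mask is copied unchanged into each new generation, the recorded best fitness never decreases from one generation to the next.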
To address the tendency to fall into local minima and the poor stability of the simple Monkey-King Genetic Algorithm (MKGA), an MKGA by Immune Evolutionary Hybridized (MKGAIEH) was proposed. MKGAIEH divided the total population into several sub-groups. To make full use of the information of the best individual (monkey-king) of the total population, the Immune Evolutionary Algorithm (IEA) was introduced into the iterative calculation. In addition, for the other individuals in the sub-groups, crossover and mutation operations were performed on the monkey-kings of the sub-groups and of the total population. After the local searches of all sub-groups were completed, the solutions of the sub-groups were mixed again. As the iteration proceeds, this strategy of combining global information exchange with local search not only avoids premature convergence but also approximates the global optimal solution with higher accuracy. Comparison experiments were carried out on 6 test functions using MKGAIEH, MKGA, Improved MKGA (IMKGA), Bee Evolutionary Genetic Algorithm (BEGA), Algorithm of Shuffled Frog Leaping based on Immune Evolutionary Particle Swarm Optimization (IEPSOSFLA), and Common climbing Operator Genetic Algorithm (COGA). The results show that MKGAIEH finds the global optimal solutions for all 6 test functions, and that on 5 test functions its mean values and standard deviations are the smallest, improving on those of the comparison algorithms by several orders of magnitude. Therefore, MKGAIEH has the best search ability and stability among the compared algorithms.
JOMP, an OpenMP-like implementation for Java, needs to be optimized, so a parallel framework that separates parallel logic from functional logic was proposed. The framework was implemented as a parallel library named waxberry, in which the parts to be processed in parallel were annotated and executed using Aspect-Oriented Programming (AOP) and run-time reflection. AOP was used to separate the parallel parts from the core ones and to weave them together. Run-time reflection was used to obtain related information during parallel execution. The waxberry library was evaluated using the Java Grande Forum (JGF) benchmarks on a quad-core processor. The experimental results show that waxberry achieves good performance.